Credibility Improves Topical Blog Post Retrieval
Abstract
Topical blog post retrieval is the task of ranking blog posts with respect to their relevance for a given topic. To improve topical blog post retrieval we incorporate textual credibility indicators in the retrieval process. We consider two groups of indicators: post level (determined using information about individual blog posts only) and blog level (determined using information from the underlying blogs). We describe how to estimate these indicators and how to integrate them into a retrieval approach based on language models. Experiments on the TREC Blog track test set show that both groups of credibility indicators significantly improve retrieval effectiveness; the best performance is achieved when combining them.
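The abstract does not say how the credibility indicators enter the language-model ranking. One common option, assumed here purely for illustration, is to treat a credibility score as a document prior in a query-likelihood model, so that posts are ranked by log p(d) + log p(q|d). In the minimal sketch below, the function names, the equal-weight mixing of post-level and blog-level scores, the Dirichlet smoothing parameter `mu`, and the dictionary layout of a post are all hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def query_log_likelihood(query_terms, doc_terms, coll_counts, coll_len, mu=2000.0):
    """Dirichlet-smoothed log p(q|d) under a unigram language model."""
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p_coll = coll_counts.get(term, 0) / coll_len
        p_term = (doc_counts[term] + mu * p_coll) / (doc_len + mu)
        if p_term == 0.0:
            # Term unseen in the whole collection: the query cannot be generated.
            return float("-inf")
        score += math.log(p_term)
    return score


def credibility_prior(post_cred, blog_cred, weight=0.5):
    """Hypothetical prior: linear mix of post-level and blog-level scores in [0, 1]."""
    mixed = weight * post_cred + (1.0 - weight) * blog_cred
    return max(mixed, 1e-12)  # keep the logarithm defined


def rank(query_terms, posts, mu=2000.0):
    """Rank posts by log prior + log query likelihood, best first."""
    coll_counts = Counter(t for post in posts for t in post["terms"])
    coll_len = sum(coll_counts.values())
    scored = []
    for post in posts:
        prior = credibility_prior(post["post_cred"], post["blog_cred"])
        likelihood = query_log_likelihood(
            query_terms, post["terms"], coll_counts, coll_len, mu
        )
        scored.append((math.log(prior) + likelihood, post["id"]))
    return [post_id for _, post_id in sorted(scored, reverse=True)]


# Two posts with identical text but different (made-up) credibility scores.
example_posts = [
    {"id": "a", "terms": ["blog", "retrieval", "spam"], "post_cred": 0.2, "blog_cred": 0.2},
    {"id": "b", "terms": ["blog", "retrieval", "spam"], "post_cred": 0.9, "blog_cred": 0.8},
]
example_ranking = rank(["blog", "retrieval"], example_posts)
```

With identical text, the two example posts receive the same query likelihood, so the ordering is decided entirely by the assumed prior. With real data the two components would trade off against each other.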
Similar resources
New Metrics for Newsblog Credibility
The blogosphere is an invaluable source of insight into attitudes towards significant world and local events. Traditional measures of topical relevance, timeliness, specificity and credibility are inadequate when it comes to blogs, however, due to their short length, high degree of quotation, exophoricity, and the short life cycle of blog postings. In this paper, we motivate a novel metric for ...
Diversity-based Blog Feed Retrieval
Blog distillation (blog feed retrieval) is a task in blog retrieval where the goal is to rank blogs according to their recurrent relevance to a query topic. One of the main properties of blog feed retrieval is that the unit of retrieval is a collection of documents as opposed to a single document as in other IR tasks. This collection retrieval nature of blog distillation introduces new challeng...
The University of Amsterdam at TREC 2008: Blog, Enterprise, and Relevance Feedback
We describe the participation of the University of Amsterdam’s ILPS group in the blog, enterprise and relevance feedback track at TREC 2008. Our main preliminary conclusions are that estimating mixture weights for external expansion in blog post retrieval is non-trivial and we need more analysis to find out why it works better for blog distillation than for blog post retrieval. For the relevanc...
New metrics for blog mining
Blogs represent an important new arena for knowledge discovery in open source intelligence gathering. Bloggers are a vast network of human (and sometimes non-human) information sources monitoring important local and global events, and other blogs, for items of interest upon which they comment. Increasingly, issues erupt from the blog world and into the real world. In order to monitor blogging a...
The University of Amsterdam at TREC 2008
We describe the participation of the University of Amsterdam’s ILPS group in the blog, enterprise and relevance feedback track at TREC 2008. Our main preliminary conclusions are that estimating mixture weights for external expansion in blog post retrieval is non-trivial and we need more analysis to find out why it works better for blog distillation than for blog post retrieval. For the relevanc...